Project Group - 15

Members: Can Balkose, Zep Van Boxtel, Stan Vos, Julia Michels

Student numbers: 6068383 , 4903684 , 4725603 , 4996569

Research Objective

Requires data modeling and quantitative research in Transport, Infrastructure & Logistics

Research Question:

What was the effect of the COVID-19 pandemic on transportation usage and mode choice across different regions and demographics in the Netherlands?

Objectives

-To analyze and visualize the impact of the COVID-19 pandemic on transportation usage and mode choice in different regions within the Netherlands.

-To provide insight into how government policies and public sentiment influenced transportation trends during the pandemic.

-To identify the key demographic factors that influenced transportation mode choice during the pandemic.

-To understand how urban and rural areas were affected differently by the pandemic in terms of transportation usage and mode choice.

-To understand how transportation behavior changed across demographics after the pandemic, and to draw conclusions about its potential long-term impacts.

Contribution Statement

Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling

Author 1:

Author 2:

Author 3:

Data Used

In [2]:
import pandas as pd
from scipy.signal import find_peaks
from scipy.signal import argrelextrema
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
In [3]:
# Distance covered per year, by degree of urbanization of the region and by mode of transport
data_distance_mode_urban = 'data/Distance_covered_on_different_urban_areas.csv'
df_urbanization_mode_urban = pd.read_csv(data_distance_mode_urban)
df_urbanization_mode_urban
Out[3]:
Year Region Mode of Transport Total Distance (billion km)
0 2018 Extremely urbanized Combined 46.8
1 2019 Extremely urbanized Combined 46.2
2 2020 Extremely urbanized Combined 31.4
3 2021 Extremely urbanized Combined 36.3
4 2022 Extremely urbanized Combined 40.9
... ... ... ... ...
195 2018 Not urbanized Other 2.0
196 2019 Not urbanized Other 2.1
197 2020 Not urbanized Other 1.5
198 2021 Not urbanized Other 1.4
199 2022 Not urbanized Other 1.3

200 rows × 4 columns
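
To compare years directly, the distances can be expressed as an index relative to 2019, the same convention the highway traffic data uses. The following is a minimal sketch, not part of the original pipeline: the helper `index_to_baseline` is hypothetical, and the sample frame contains only the five rows visible in the preview above.

```python
import pandas as pd

VALUE_COL = "Total Distance (billion km)"
KEYS = ["Region", "Mode of Transport"]

# Rows copied from the Out[3] preview (Extremely urbanized, Combined).
sample = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021, 2022],
    "Region": ["Extremely urbanized"] * 5,
    "Mode of Transport": ["Combined"] * 5,
    VALUE_COL: [46.8, 46.2, 31.4, 36.3, 40.9],
})


def index_to_baseline(df, baseline_year=2019):
    """Hypothetical helper: add an index column with baseline_year = 100."""
    # Baseline distance for every (region, mode) pair.
    baseline = df.loc[df["Year"] == baseline_year, KEYS + [VALUE_COL]]
    baseline = baseline.rename(columns={VALUE_COL: "baseline"})
    # A left merge keeps every original row and attaches its baseline value.
    out = df.merge(baseline, on=KEYS, how="left")
    index_col = f"Index ({baseline_year} = 100)"
    out[index_col] = (out[VALUE_COL] / out["baseline"] * 100).round(1)
    return out.drop(columns="baseline")


indexed = index_to_baseline(sample)
print(indexed[["Year", "Index (2019 = 100)"]])
```

For these sample rows the 2020 index comes out at roughly 68, so total distance in extremely urbanized regions fell by about a third relative to 2019. Applied to the full `df_urbanization_mode_urban`, the same helper should give one index per region and mode, assuming every region and mode combination has a 2019 row.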

In [4]:
# Share of public transportation usage per year, by demographic group
data_usage_of_public_transport = 'data/Usage_of_public_transportation.csv'
df_usage_of_public_transport = pd.read_csv(data_usage_of_public_transport)
df_usage_of_public_transport
Out[4]:
Demographic Year Usage of public transportation (%)
0 Age: 12 to 17 years 2018 11.7
1 Age: 12 to 17 years 2019 10.7
2 Age: 12 to 17 years 2020 6.3
3 Age: 12 to 17 years 2021 6.3
4 Age: 12 to 17 years 2022 9.2
... ... ... ...
100 No driver's license; 17 years or older 2018 17.5
101 No driver's license; 17 years or older 2019 16.3
102 No driver's license; 17 years or older 2020 8.5
103 No driver's license; 17 years or older 2021 9.8
104 No driver's license; 17 years or older 2022 13.7

105 rows × 3 columns
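
One way to quantify how far each group had returned to its pre-pandemic usage is to pivot the long table to one column per year and divide the 2022 value by the 2019 value. The sketch below is illustrative only: it uses just the two demographic groups visible in the preview above, and the `recovery` ratio is an assumed metric rather than one defined in the source data.

```python
import pandas as pd

USAGE_COL = "Usage of public transportation (%)"

# Rows copied from the Out[4] preview; all other demographics are omitted.
sample = pd.DataFrame({
    "Demographic": ["Age: 12 to 17 years"] * 5
                   + ["No driver's license; 17 years or older"] * 5,
    "Year": [2018, 2019, 2020, 2021, 2022] * 2,
    USAGE_COL: [11.7, 10.7, 6.3, 6.3, 9.2, 17.5, 16.3, 8.5, 9.8, 13.7],
})

# One row per demographic, one column per year.
wide = sample.pivot(index="Demographic", columns="Year", values=USAGE_COL)

# Fraction of the 2019 usage level reached again in 2022 (1.0 = full recovery).
recovery = (wide[2022] / wide[2019]).round(2)
print(recovery)
```

In this sample both groups remain below their 2019 level in 2022, at about 0.86 and 0.84 of it respectively.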

In [16]:
# Traffic volume on Dutch highways on weekdays and weekends, indexed to 2019 (2019 = 100)
data_traffic_highways = 'data/CBS Dutch highway traffic.csv'
df_data_traffic_highways = pd.read_csv(data_traffic_highways)
# Drop the last three rows of the file (assumed here to be non-data footer rows of the CBS export)
df_data_traffic_highways = df_data_traffic_highways.iloc[:-3]
df_data_traffic_highways
Out[16]:
Week Doordeweeks, 2020 (2019 = 100) In het weekeinde, 2020 (2019 = 100) Doordeweeks, 2021 (2019 = 100) In het weekeinde, 2021 (2019 = 100) Doordeweeks, 2022 (2019 = 100) In het weekeinde, 2022 (2019 = 100) Doordeweeks, 2023 (2019 = 100) In het weekeinde, 2023 (2019 = 100)
0 1 83.0 101.0 71.0 67.0 96.0 82.0 103.0 99.0
1 2 99.0 102.0 79.0 64.0 86.0 86.0 93.0 98.0
2 3 100.0 102.0 77.0 65.0 85.0 84.0 91.0 95.0
3 4 104.0 106.0 78.0 67.0 88.0 91.0 98.0 103.0
4 5 102.0 103.0 78.0 48.0 87.0 86.0 95.0 100.0
5 6 99.0 88.0 62.0 61.0 87.0 91.0 92.0 99.0
6 7 97.0 90.0 73.0 68.0 82.0 82.0 93.0 85.0
7 8 99.0 87.0 80.0 68.0 88.0 89.0 92.0 95.0
8 9 94.0 105.0 78.0 74.0 86.0 92.0 91.0 104.0
9 10 98.0 99.0 80.0 63.0 88.0 89.0 93.0 96.0
10 11 91.0 67.0 80.0 71.0 88.0 91.0 94.0 99.0
11 12 60.0 38.0 80.0 68.0 89.0 92.0 93.0 91.0
12 13 51.0 33.0 80.0 67.0 87.0 85.0 92.0 93.0
13 14 52.0 33.0 77.0 60.0 88.0 84.0 95.0 88.0
14 15 52.0 35.0 76.0 65.0 90.0 87.0 89.0 94.0
15 16 47.0 39.0 76.0 67.0 86.0 92.0 89.0 92.0
16 17 58.0 53.0 75.0 85.0 84.0 107.0 85.0 113.0
17 18 56.0 49.0 81.0 80.0 89.0 103.0 91.0 99.0
18 19 61.0 57.0 77.0 75.0 93.0 93.0 94.0 96.0
19 20 66.0 57.0 83.0 74.0 89.0 91.0 87.0 100.0
20 21 64.0 61.0 78.0 80.0 86.0 97.0 93.0 96.0
21 22 78.0 62.0 89.0 75.0 98.0 83.0 97.0 87.0
22 23 73.0 70.0 84.0 88.0 88.0 98.0 93.0 104.0
23 24 81.0 73.0 87.0 84.0 94.0 95.0 94.0 96.0
24 25 81.0 82.0 86.0 84.0 90.0 92.0 91.0 97.0
25 26 84.0 81.0 87.0 88.0 91.0 93.0 91.0 93.0
26 27 86.0 84.0 88.0 90.0 90.0 93.0 91.0 95.0
27 28 86.0 89.0 86.0 87.0 91.0 94.0 93.0 95.0
28 29 89.0 95.0 88.0 88.0 89.0 94.0 92.0 95.0
29 30 93.0 95.0 89.0 88.0 93.0 97.0 NaN NaN
30 31 93.0 91.0 88.0 87.0 92.0 93.0 NaN NaN
31 32 91.0 90.0 90.0 94.0 91.0 93.0 NaN NaN
32 33 88.0 91.0 90.0 96.0 90.0 98.0 NaN NaN
33 34 90.0 84.0 91.0 87.0 92.0 92.0 NaN NaN
34 35 90.0 86.0 91.0 90.0 94.0 91.0 NaN NaN
35 36 92.0 92.0 94.0 90.0 94.0 93.0 NaN NaN
36 37 90.0 92.0 92.0 96.0 92.0 89.0 NaN NaN
37 38 92.0 89.0 94.0 93.0 91.0 91.0 NaN NaN
38 39 89.0 74.0 93.0 93.0 91.0 80.0 NaN NaN
39 40 88.0 75.0 96.0 97.0 96.0 94.0 NaN NaN
40 41 82.0 76.0 92.0 98.0 91.0 93.0 NaN NaN
41 42 81.0 70.0 93.0 98.0 91.0 94.0 NaN NaN
42 43 79.0 68.0 92.0 90.0 94.0 97.0 NaN NaN
43 44 77.0 66.0 90.0 85.0 91.0 86.0 NaN NaN
44 45 78.0 68.0 90.0 83.0 93.0 94.0 NaN NaN
45 46 79.0 69.0 85.0 83.0 94.0 96.0 NaN NaN
46 47 81.0 72.0 86.0 80.0 93.0 97.0 NaN NaN
47 48 82.0 75.0 81.0 76.0 91.0 88.0 NaN NaN
48 49 83.0 73.0 85.0 80.0 92.0 95.0 NaN NaN
49 50 82.0 72.0 85.0 79.0 92.0 90.0 NaN NaN
50 51 76.0 63.0 78.0 72.0 87.0 79.0 NaN NaN
51 52 83.0 59.0 89.0 61.0 105.0 96.0 NaN NaN
52 53 86.0 65.0 NaN NaN NaN NaN NaN NaN
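
The wide layout above, with one column per day type and year, is awkward to filter or group. A possible alternative is to reshape it to long format and parse the day type and year out of the Dutch column names ("Doordeweeks" means on weekdays, "In het weekeinde" means at the weekend). This is a sketch under assumptions: the names `long_df`, `day_type`, `year` and `traffic_index` are introduced here for illustration, and the sample holds only weeks 11 to 17 of the 2020 columns from the table above.

```python
import pandas as pd

# A few rows copied from the Out[16] table (weeks 11-17, 2020 columns only).
sample = pd.DataFrame({
    "Week": [11, 12, 13, 14, 15, 16, 17],
    "Doordeweeks, 2020 (2019 = 100)": [91.0, 60.0, 51.0, 52.0, 52.0, 47.0, 58.0],
    "In het weekeinde, 2020 (2019 = 100)": [67.0, 38.0, 33.0, 33.0, 35.0, 39.0, 53.0],
})

# Wide -> long: one row per (week, original column).
long_df = sample.melt(id_vars="Week", var_name="series", value_name="traffic_index")

# Split e.g. "Doordeweeks, 2020 (2019 = 100)" into a day type and a year.
parts = long_df["series"].str.extract(r"^(?P<day_type>.+?), (?P<year>\d{4}) ")
long_df["day_type"] = parts["day_type"].map(
    {"Doordeweeks": "Weekdays", "In het weekeinde": "Weekend"}
)
long_df["year"] = parts["year"].astype(int)

# Weeks without a measurement (NaN in the full table) would break idxmin.
long_df = long_df.drop(columns="series").dropna(subset=["traffic_index"])

# Week with the lowest traffic index for each year and day type.
lowest = long_df.loc[long_df.groupby(["year", "day_type"])["traffic_index"].idxmin()]
print(lowest)
```

For the sample rows, the lowest weekday index is 47 in week 16, and the lowest weekend index is 33, first reached in week 13 (week 14 ties; `idxmin` reports the first occurrence).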

Mobility; per person, personal characteristics, travel purposes and regions

https://opendata.cbs.nl/statline/#/CBS/en/dataset/84687ENG/table?dl=97AA7

This dataset has not been added to the notebook yet.

Data Pipeline

In [8]:
# Filter out the 'Combined' rows so that totals are not double-counted alongside the individual modes
filtered_df_urbanization_mode_urban = df_urbanization_mode_urban[df_urbanization_mode_urban["Mode of Transport"] != 'Combined']
In [9]:
# One pie chart per year: share of total distance covered by each mode of transport (summed over all regions)
years_to_visualize = [2018, 2019, 2020, 2021, 2022]
for year in years_to_visualize:
    df_year = filtered_df_urbanization_mode_urban[filtered_df_urbanization_mode_urban['Year'] == year]

    mode_distance = df_year.groupby('Mode of Transport')['Total Distance (billion km)'].sum()

    plt.figure(figsize=(3, 3))
    plt.pie(mode_distance, labels=mode_distance.index, autopct='%1.1f%%', startangle=140)
    plt.title(f'Distance Covered by Mode of Transport in {year}')
    plt.axis('equal')  
    plt.show()
In [10]:
# Keep only the equivalised income groups (demographic labels containing 'Equ')
filtered_income_df_usage_of_public_transport = df_usage_of_public_transport[df_usage_of_public_transport['Demographic'].str.contains('Equ')]
In [11]:
fig = px.bar(
    filtered_income_df_usage_of_public_transport,
    x="Demographic",
    y="Usage of public transportation (%)",
    color='Demographic',  
    animation_frame="Year",
    range_y=[0, 20],
    title="Usage of Public Transportation Over Years",
    labels={"Usage of public transportation (%)": "Usage (%)"},

)
fig.update_xaxes(categoryorder='total descending')

fig.show()
In [18]:
# Keep only the driver's license groups; 'river' matches both "Driver" and "driver" in the labels
filtered_driver_license_df_usage_of_public_transport = df_usage_of_public_transport[df_usage_of_public_transport['Demographic'].str.contains('river')]

fig = px.line(
    filtered_driver_license_df_usage_of_public_transport,
    x="Year",
    y="Usage of public transportation (%)",
    color="Demographic",
    title="Usage of Public Transportation Over Years by Driver License and Car Ownership",
    labels={"Usage of public transportation (%)": "Usage (%)"},
    markers=True
)
# Show exactly one tick per survey year; tick positions stay numeric to match the integer Year column
desired_years = [2018, 2019, 2020, 2021, 2022]

fig.update_xaxes(tickvals=desired_years, ticktext=[str(year) for year in desired_years])

fig.show()
In [17]:
sns.set_style("whitegrid")

plt.figure(figsize=(12, 6))

# Plot one line per column (day type and year); the first column is the week number
for column in df_data_traffic_highways.columns[1:]:
    sns.lineplot(x="Week", y=column, data=df_data_traffic_highways, label=column)

plt.legend(loc="upper right")
plt.xlabel("Week")
plt.ylabel("Traffic index (2019 = 100)")
plt.title("Dutch Highway Traffic per Week Relative to 2019")

plt.show()